Can transcriptome size be estimated from SAGE catalogs?
Abstract
MOTIVATION: SAGE (Serial Analysis of Gene Expression) can be used to estimate the number of unique transcripts in a transcriptome. A simple estimator that corrects for sequencing and sampling errors was applied to a SAGE library (137 832 tags) obtained from mouse embryonic stem cells, and also to Monte Carlo simulated libraries generated using assumed distributions of 'true' expression levels consistent with the data.
RESULTS: When the corrected data themselves were taken as the underlying model of 'ground truth', the estimator converged to the 'true' value (53 535) only after counting 300 000 simulated tags, more than twice the number in the experiment. The SAGE data could also be fit well by a Monte Carlo model based on a truncated inverse-square distribution of expression levels, with 130 000 'true' transcripts and 10^6 samples needed for convergence. We conclude that the size of a transcriptome is ill-determined from SAGE libraries of even moderately large size. To obtain a valid estimate, one must sample a number of tags inversely proportional to the lowest abundance level, which is not known a priori. This constrains the design of SAGE experiments intended to determine biological complexity.
AVAILABILITY: The 'homemade' software used for this analysis was not designed for general or 'production' use, but the authors will be happy to share Fortran source code with interested parties.
CONTACT: [email protected]
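The convergence problem described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' Fortran code and it does not implement their error-correcting estimator; it only counts distinct transcripts seen in simulated libraries of growing size. The function name, the truncation level `max_level`, and the scaled-down transcript count are all assumptions chosen for illustration.

```python
import numpy as np


def simulate_unique_counts(n_transcripts, max_level, sample_sizes, seed=0):
    """Count distinct transcripts observed in simulated tag libraries.

    Assumed model (hypothetical parameters, not taken from the paper):
    each transcript gets an integer expression level k in 1..max_level
    with probability proportional to 1/k^2 (truncated inverse-square),
    and tags are drawn with probability proportional to that level.
    """
    rng = np.random.default_rng(seed)

    # Truncated inverse-square distribution over expression levels.
    levels = np.arange(1, max_level + 1)
    weights = 1.0 / levels.astype(float) ** 2
    weights /= weights.sum()

    # Assign a 'true' expression level to every transcript.
    abundance = rng.choice(levels, size=n_transcripts, p=weights)
    tag_prob = abundance / abundance.sum()

    # Draw one long stream of tags; each library is a prefix of it,
    # so the unique counts are directly comparable across sizes.
    tags = rng.choice(n_transcripts, size=max(sample_sizes), p=tag_prob)
    return {n: int(np.unique(tags[:n]).size) for n in sample_sizes}


if __name__ == "__main__":
    # Scaled-down example: 13 000 'true' transcripts instead of 130 000.
    sizes = [1_000, 10_000, 100_000]
    for n, seen in simulate_unique_counts(13_000, 1_000, sizes).items():
        print(f"{n:>7} tags sampled -> {seen} distinct transcripts observed")
```

Under this assumed model, the number of distinct transcripts observed keeps rising as more tags are sampled, because most transcripts sit at the lowest abundance levels. That is the qualitative behaviour the abstract points to: a count taken from a moderately sized library understates the true number of transcripts.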
Similar resources
Effect of cutting size and position on propagation ability of Sage (Salvia officinalis L.)
The investigation was conducted during the wet and dry seasons in Ethiopia in 2012/2013 at the Wondo Genet Agricultural Research Center nursery site. Four levels of cutting size and three levels of cutting position were arranged in a randomized complete block design with three replications. Data on seedling height, branch number/seedling, root number/seedling, root weight/seedling, root length/s...
A human glomerular SAGE transcriptome database
BACKGROUND To facilitate the identification of gene products important in regulating renal glomerular structure and function, we have produced an annotated transcriptome database for normal human glomeruli using the SAGE approach. DESCRIPTION The database contains 22,907 unique SAGE tag sequences, with a total tag count of 48,905. For each SAGE tag, the ratio of its frequency in glomeruli ...
Transcriptome annotation using tandem SAGE tags
Analysis of several million expressed gene signatures (tags) revealed an increasing number of different sequences, largely exceeding that of annotated genes in mammalian genomes. Serial analysis of gene expression (SAGE) can reveal new Poly(A) RNAs transcribed from previously unrecognized chromosomal regions. However, conventional SAGE tags are too short to identify unambiguously unique sites i...
Reproducibility, bioinformatic analysis and power of the SAGE method to evaluate changes in transcriptome
The serial analysis of gene expression (SAGE) method is used to study global gene expression in cells or tissues in various experimental conditions. However, its reproducibility has not yet been definitively assessed. In this study, we have evaluated the reproducibility of the SAGE method and identified the factors that affect it. The determination coefficient (R^2) for the reproducibility of S...
Journal title:
- Bioinformatics
Volume 19, Issue 4
Publication date: 2003